simplex theory are also accurate under many combinations of parameters and are second
in accuracy to Local Independence. These indices decrease in accuracy substantially
at the highest level of factor correlations and the widest dispersion of item difficulties
that we used.

The fitting of a simplex R-matrix to observed intercorrelations of binary items provides
high accuracy under a wide range of parameters, but becomes highly sensitive to the
combination of a high level of factor correlations and a wide distribution of item
difficulties. An increase in sample size produces no increase in accuracy in the most
unfavorable combination of parameters.

A quantitative index of the shape of the curve of successive Eigenvalues was used for
matrices of phi coefficients, tetrachoric correlations, and variance-covariance matrices.
None of these indices produced satisfactory accuracy except under most favorable
combinations of parameters. Even so, the Eigenvalues of variance-covariance matrices provide
a more accurate basis for a decision concerning dimensionality than tetrachoric
correlations, which have been the statistics of choice. Tetrachorics are probably not dependable
for any purpose when there is a wide range of item difficulties (or popularities) except
in sample sizes substantially larger than 2,000.


Introduction 


In our research we have investigated indices of dimensionality of four
distinct types and reported on indices representing three of these types in
both our first technical report (Tucker, Humphreys, and Roznowski, 1986) and
in the second (Roznowski, Humphreys, and Tucker, 1987).

Local  Independence  Indices 

If a test is unidimensional, the items in the test are independent of
each other in a sample of people all at the same level on the latent trait.
This property of the test is local independence. Having the same raw score on
the test can be used as an estimate of having the same score on the latent
trait. Starting from this assumption, two indices of dimensionality were
tried. They are called Local Independence Indices.
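
The formulas for the two Local Independence Indices are given in the technical reports and are not reproduced here. As a rough illustration of the underlying idea only, the sketch below groups examinees by raw score and averages the within-group covariance of two items; under local independence that average should be near zero. The function name, the data layout, and the size-weighted pooling are assumptions made for illustration and are not the authors' index.

```python
from collections import defaultdict


def conditional_covariance(responses, i, j):
    """Size-weighted average covariance of items i and j within raw-score groups.

    responses: list of 0/1 lists, one row per examinee (hypothetical layout).
    The total raw score stands in for the latent trait, as the text describes.
    """
    groups = defaultdict(list)
    for row in responses:
        groups[sum(row)].append(row)

    weighted_sum, n_total = 0.0, 0
    for rows in groups.values():
        n = len(rows)
        p_i = sum(r[i] for r in rows) / n       # proportion passing item i
        p_j = sum(r[j] for r in rows) / n       # proportion passing item j
        p_ij = sum(r[i] * r[j] for r in rows) / n  # proportion passing both
        weighted_sum += n * (p_ij - p_i * p_j)
        n_total += n
    return weighted_sum / n_total
```

Because the raw score used for grouping includes the two items themselves, a small negative bias is to be expected in short tests; a practical index would have to allow for it.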

Pattern  Indices 

When the product-moment intercorrelations (phis) of a perfect Guttman
scale are factored, the second factor loadings have a distinctive pattern of
signs and relative sizes. If the obtained intercorrelations are based on a
single factor plus random error, the second factor loadings can be expected to
approximate the expected pattern. In all, we tried four ways of obtaining a
quantitative index of dimensionality from the pattern of second factor
loadings. They are called Pattern Indices.

Ratio  of  Differences  Indices 

Indices of dimensionality in the intercorrelations of continuous variates
have long depended on relations among the successive Eigenvalues obtained in a
principal factors analysis. Whether known as the "scree" test or "root
staring," the assumption is that there will be a break in the curve of the
Eigenvalues at the point the last replicable factor has been extracted.
Thereafter the roots should have little slope. For the one factor case the
ratio of the difference between the first two roots to subsequent differences
(we chose the average of the next two) should be large. We applied this
principle to the roots of tetrachoric, product-moment, and variance-covariance
matrices. These are the Ratio of Differences Indices.
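
The ratio just described can be written out in a few lines. The sketch below reads "the average of the next two" as the mean of the second-to-third and third-to-fourth root differences; that reading, the function name, and the illustrative roots are assumptions.

```python
def ratio_of_differences(eigenvalues):
    """Ratio of the first root difference to the mean of the next two differences.

    eigenvalues: roots in descending order; only the first four are used.
    """
    l1, l2, l3, l4 = eigenvalues[:4]
    first_drop = l1 - l2
    later_drop = ((l2 - l3) + (l3 - l4)) / 2.0
    # later_drop can be zero (or negative in noisy data); callers should guard.
    return first_drop / later_drop


# Illustrative roots only, not taken from the report.
print(ratio_of_differences([6.0, 2.0, 1.5, 1.0]))  # 8.0
```

A large value is what the one factor case is expected to produce; the index is undefined when the later differences average to zero.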

Root  One  Index 

Even more popular than a break in the latent roots obtained in a
principal factors analysis is the "root one" criterion in a principal
components analysis. We did not look systematically at this criterion when
applied to binary items for the very good reason that it was obviously
inappropriate, given the low level of item intercorrelations in typical
cognitive tests. One need only inspect a few samples to reject "root one" as
a basis for a decision concerning dimensionality. The failure in binary data
also indicates one important reason for failure in continuous data; that is,
when replicable factors are determined by relatively small correlations, the
criterion will fail.
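
For reference, the "root one" rule amounts to counting the roots of the correlation matrix (unities in the diagonal) that exceed 1.0. The sketch below applies it to the first eight roots reported in Table 2 for the first five-factor sample; the function name is an assumption.

```python
def root_one_count(eigenvalues):
    """Number of principal-component roots greater than 1.0."""
    return sum(1 for value in eigenvalues if value > 1.0)


# First eight roots, five-factor model, sample 1 (Table 2, unities in the diagonal).
five_factor_sample_1 = [14.24, 1.94, 1.61, 1.06, 0.97, 0.66, 0.63, 0.56]
print(root_one_count(five_factor_sample_1))  # 4
```

Only eight of the thirty roots are tabled, but the remaining roots are smaller still, so the count is unaffected: the rule stops at four factors where five were built into the model.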

Simplex Fitting Index

The items in a perfect Guttman scale form an R-matrix that has the
perfect simplex form. It seemed possible that an index of dimensionality from
the fit of the simplex model to the observed intercorrelations could be
obtained. Although closely related in conception to the Pattern Indices, we
have placed it in a fourth category. Data concerning the Simplex Fitting
Index have not previously been made a matter of record.

The  Population  Model  Used 

An empirical investigation of criteria to distinguish between
unidimensionality and multidimensionality as revealed in the intercorrelations
of binary items requires a population model from which samples of both people
and items can be drawn. Our model was described in considerable detail in our
first technical report (Tucker, Humphreys, and Roznowski, 1986). For present
purposes it is sufficient to list the parameters of the model. In our
research some of the parameters were varied; others were fixed.

The parameters varied were as follows: sample size, number of items,
distribution of item difficulties, number of factors, and level of
intercorrelations of the factors. We set other parameters so that the model
reflected realistic psychological data. Some items were loaded on a single
factor while others were complex. A sufficient number of factorially pure
items was included to provide adequate factor definition for all numbers of
factors. Provision was made for success by guessing. The size of item
intercorrelations was set at levels typical of cognitive tests in which
substantial unique variance is the norm. As number of factors, level of
factor correlations, and distributions of item difficulties in the model were
varied, variation in size of item correlations could not be held constant.
Mean item correlations are presented in both published technical reports.

Factors in Continuous Variates

Insight into the nature of our model can be obtained by factoring
intercorrelations of the variates prior to the point at which continuous
variates are dichotomized to form binary ones. Table 1 presents Eigenvalues
for three samples of 500 each for 30 variables for one through five principal
factors in which squared multiple correlations served as communality
estimates. For factors two through five intercorrelations were set, on
average, at .55, which we considered to be at an intermediate oblique level.
Also included in the table are estimated Eigenvalues for random data matrices
based on the parallel analysis procedure of Montanelli and Humphreys (1976).
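
As an illustration of how roots of this kind are obtained, the sketch below places squared multiple correlations in the diagonal of a correlation matrix and extracts the Eigenvalues of the reduced matrix. It uses NumPy and the standard identity SMC_i = 1 - 1/r^ii, where r^ii is the i-th diagonal element of the inverse of R; the function name is an assumption, and this is not the authors' program.

```python
import numpy as np


def principal_factor_eigenvalues(R):
    """Eigenvalues of R with squared multiple correlations in the diagonal."""
    R = np.asarray(R, dtype=float)
    # SMC of each variable with all the others, from the inverse of R.
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    reduced = R.copy()
    np.fill_diagonal(reduced, smc)
    # eigvalsh returns ascending order for symmetric input; reverse it.
    return np.sort(np.linalg.eigvalsh(reduced))[::-1]
```

Because the reduced matrix is not positive definite in general, some of the later roots are negative; only the leading roots are of interest for the number of factors decision.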
Table 1

Successive Eigenvalues in One to Five Factors in the
Continuous Model with Squared Multiple Correlations in the Diagonal
(N = 500)

                                      Eigenvalues
Factors  Sample      1      2      3      4      5      6      7      8

   1        1    19.42    .21    .15    .14    .14    .10    .10    .10
            2    17.68    .21    .18    .18    .12    .12    .10    .09
            3    19.32    .16    .14    .14    .12    .11    .09    .08

   2        1    15.27   3.60    .17    .14    .13    .09    .06    .06
            2    15.08   3.55    .14    .14    .10    .09    .08    .07
            3    14.90   3.27    .18    .16    .13    .10    .09    .08

   3        1    14.75   2.11   1.51    .17    .12    .11    .10    .08
            2    15.13   2.65   1.89    .15    .12    .11    .11    .08
            3    15.90   1.68   1.41    .16    .13    .12    .10    .09

   4        1    14.05   1.91   1.46    .66    .15    .14    .10    .08
            2    15.95   1.24   1.08    .74    .17    .14    .11    .10
            3    15.65   1.08    .82    .78    .18    .17    .12    .10

   5        1    13.81   1.50   1.14    .63    .50    .18    .14    .10
            2    15.22   1.22    .98    .79    .43    .12    .11    .09
            3    16.79   1.19    .74    .62    .43    .15    .13    .11

Parallel Analysis
Estimates          .57    .48    .42    .38    .34    .32    .27    .24

The parameters for the tabled data were selected to represent
intermediate levels of those used in the research reported in our second
technical report. For the one factor case, however, we used the level of item
intercorrelations on which the data in the first technical report were based.
(The later change was made to bring the one factor correlations more nearly in
line with the multiple factor correlations.)

For the continuous data of Table 1 there are breaks in the curve formed
by the successive Eigenvalues for the proper number of factors in each sample.
The parallel analysis criterion also leads to the expected number of factors.
Eigenvalues for the same R-matrices in which unities have been placed in the
diagonal, shown in Table 2, have similar breaks to those in Table 1 for the
expected number of common factors, but the "root one" criterion fails after
four common factors. Investigators who advocate a parallel analysis criterion
in principal component analysis would commit additional errors. They would
frequently accept only three factors in samples in which either four or five
were required.
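
The parallel analysis decision for Table 1 can be checked by counting the leading observed roots that exceed the corresponding random-data estimates. The sketch below does only that comparison, using values copied from Table 1; it does not reproduce the Montanelli and Humphreys (1976) procedure that generates the estimates, and the function name is an assumption.

```python
def count_factors_parallel(observed, random_estimates):
    """Count leading observed roots that exceed the random-data estimates."""
    count = 0
    for obs, rand in zip(observed, random_estimates):
        if obs <= rand:
            break  # stop at the first root no larger than chance
        count += 1
    return count


# Values from Table 1 (squared multiple correlations in the diagonal, N = 500).
PARALLEL_ESTIMATES = [0.57, 0.48, 0.42, 0.38, 0.34, 0.32, 0.27, 0.24]
three_factor_sample_3 = [15.90, 1.68, 1.41, 0.16, 0.13, 0.12, 0.10, 0.09]
five_factor_sample_1 = [13.81, 1.50, 1.14, 0.63, 0.50, 0.18, 0.14, 0.10]

print(count_factors_parallel(three_factor_sample_3, PARALLEL_ESTIMATES))  # 3
print(count_factors_parallel(five_factor_sample_1, PARALLEL_ESTIMATES))   # 5
```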

If the factors were more highly oblique, or if sample sizes were smaller,
the number of factors decision would have been made with less confidence and
probably less accuracy for the continuous variates. Even with the
parameters used, the distribution of Eigenvalues looks increasingly
unidimensional as the number of factors increases from two to five. It is
clear that the number of factors decision would be made with more error in all
combinations of parameters after the loss of information from the conversion
of continuous variates to binary ones and from the introduction of success by
guessing.

Table 2

Successive Eigenvalues in One to Five Factors in the
Continuous Model with Unities in the Diagonal
(N = 500)

                                      Eigenvalues
Factors  Sample      1      2      3      4      5      6      7      8

   1        1    19.76    .58    .54    .53    .52    .50    .48    .45
            2    18.07    .66    .66    .64    .58    .56    .53    .52
            3    19.66    .59    .57    .52    .51    .49    .47    .44

   2        1    15.63   3.98    .61    .59    .58    .52    .49    .49
            2    15.45   3.93    .58    .56    .56    .52    .50    .49
            3    15.28   3.67    .62    .61    .60    .59    .54    .52

   3        1    15.14   2.50   1.93    .59    .56    .53    .51    .51
            2    15.48   2.99   2.25    .55    .53    .52    .48    .46
            3    16.26   2.08   1.79    .58    .56    .55    .53    .49

   4        1    14.46   2.31   1.87   1.10    .59    .58    .54    .52
            2    16.33   1.62   1.47   1.13    .58    .55    .54    .52
            3    16.04   1.48   1.25   1.19    .64    .60    .58    .55

   5        1    14.24   1.94   1.61   1.06    .97    .66    .63    .56
            2    15.61   1.61   1.39   1.21    .89    .58    .53    .51
            3    17.15   1.55   1.12    .98    .80    .54    .53    .50


Evaluation  of  the  Model 

At this point one might ask the question whether our model is more
complex than most real data matrices one is likely to encounter in practice.
Our answer is negative. Investigators typically find clearcut simple
structure in analyses of continuous measures only when they start with a good
deal of information about their measures and select carefully a battery of
tests on the basis of that information. Just as factorially pure tests are
not in common supply, we do not expect to find commonly factorially pure
items.

Intercorrelations among ability factors defined by continuous variates
also tend to be high in wide ranges of talent. There are stable factor
correlations in the ASVAB battery at the level of .55, for example, in which
the factors are defined by tests having quite dissimilar content such as
Arithmetic Reasoning and Mathematics Knowledge, on the one hand, and
Vocabulary and Reading Comprehension, on the other. Not only is it unlikely
that these four types of items would be found in a single item pool, but the
two types that defined separate factors among total scores are unlikely to be
found together. The selective factors imposed on item pools by the test
constructor's conceptualization of the test are highly likely to produce high
levels of obliqueness among multiple factors.

Finally, the emphasis in Item Response Theory on equivalent measurement
accuracy at all levels of ability requires item pools having a wide range of
item p-values. Without a wide range of difficulties in the pool, item
parameters are not well determined. Indices of dimensionality are expected to
be applied to pools of binary items of which many are factorially complex, the
multiple factors are highly correlated, and the items are widely different in
levels of difficulty.


When cognitive data are viewed in terms of a hierarchical factor model,
no serious error is committed by accepting a unidimensional hypothesis in the
presence of multiple factors that are substantially oblique. The error becomes
progressively less serious as the number of factors increases and as the
factor correlations increase. Each of these parameters increases the
contribution to variance of the general factor in the total score. Not only
does the contribution to total variance of a group factor decrease as the
number of factors increases, but the contribution to variance of the sum of
five group factors is less than the contribution to variance of the sum of two
group factors of the same degree of obliqueness. It seems counterintuitive,
but as long as each item measures the general factor, the greater the
factorial complexity of the test items the more closely does the total score
reflect a single dimension. In many, many applications the most valid
dimension is the one defined by the factor correlations. A test constructor
should not choose to measure only one of the correlated factors in most
applications.

Properly weighted multiple dimensions that are substantially positively
correlated are not a problem for Item Response Theory when each examinee is
exposed to every item as in a standard printed test. Multiple dimensions do
become a problem, however, in adaptive testing. The expected build-up of the
general or common variance in the total score only occurs when all secondary
factors are adequately sampled. One cannot rely on an algorithm for the
selection of items that depends only on the "a" and "b" parameters of the
items in the pool. Secondary factors must be known or estimated and the
information used in item selection in order to maximize the validity of the
test as well as to avoid the bias that would result if all examinees were not
measured on the same dominant dimension.


The remainder of this report contains three sections: a discussion of
unpublished data that were reported at the 1987 ONR conference, an account of
a generally unsuccessful attempt to compensate for the negative effects on
indices of dimensionality of wide ranges of item difficulties, and a summary
of what we have learned during the project, including our recommendation
concerning a preferred index. The latter is supported in Appendix A by more
complete data than we included in our second technical report.

Results  Reported  Orally 

Fitting  a  Simplex 

We obtained data for 20 and 30 items on an index in which the simplex
model was applied directly to the product-moment correlations of the binary
items. These data were reported orally during the ONR conference at the
University of South Carolina (1987). This index is identified as Simplex
Fitting. The Pattern Indices which appeared in our two technical reports are
based on the characteristics of the principal factors extracted from a simplex
R-matrix. We obtained least squares fits of the simplex model¹ to observed
R-matrices for 100 samples each of all combinations of one through five factors,
Ns of 125, 500, and 2000, three levels of distribution of item difficulties, and
three levels of factor intercorrelations. For both of the last two parameters the
levels were the same as those used in our second technical report.

The simplex model requires item product-moment correlations corrected for
attenuation to have the property that all $r_{ik \cdot j} = 0$, where
$p_i > p_j > p_k$. By fixing only two reliabilities, those for the easiest and
for the most difficult items, it is possible to obtain unique solutions for the
remaining n-2 reliabilities and for n-3 true score correlations between items
adjacent in difficulty. Given unique estimates of adjacent item true score
correlations, the remaining correlations in the estimated R-matrix are
themselves uniquely determined. Not wishing the index to be confounded with
the arbitrary fixing of the two reliabilities and, in consequence, two true
score correlations, we decided at the outset to eliminate the 2(n-1) residuals
associated with the most and least difficult items.
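
The statement that adjacent correlations determine the rest of the matrix follows from the zero partial correlations: if the partial correlation of items i and k with j held constant is zero, then r_ik = r_ij r_jk, so every true score correlation is a product of the adjacent ones lying between the two items. The sketch below builds such a matrix and checks one partial correlation. It is a minimal illustration of the simplex property under that assumption, not the authors' least squares program; the function names and the example values are hypothetical.

```python
import math


def simplex_from_adjacent(adjacent):
    """Build a perfect-simplex matrix from adjacent-item correlations.

    adjacent[k] is the correlation between items k and k + 1, with items
    ordered by difficulty. Nonadjacent entries are products of the
    adjacent correlations lying between the two items.
    """
    n = len(adjacent) + 1
    R = [[1.0] * n for _ in range(n)]
    for i in range(n):
        product = 1.0
        for k in range(i + 1, n):
            product *= adjacent[k - 1]
            R[i][k] = R[k][i] = product
    return R


def partial_correlation(R, i, k, j):
    """First-order partial correlation of items i and k with item j held constant."""
    numerator = R[i][k] - R[i][j] * R[j][k]
    denominator = math.sqrt((1 - R[i][j] ** 2) * (1 - R[j][k] ** 2))
    return numerator / denominator


R = simplex_from_adjacent([0.8, 0.5, 0.9])   # hypothetical values
print(partial_correlation(R, 0, 2, 1))        # 0.0
```

Observed correlations would additionally be attenuated by the item reliabilities, which is why the fit described above estimates reliabilities along with the adjacent true score correlations.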

Results 

Tables 3 and 4 contain statistics for 20 and 30 items, respectively,
showing the degree of separation of the sampling distributions of one factor
from factors two through five. This statistic is the same as the one
described in our second technical report. A value of 200 represents no
overlap between distributions while anything close to 100 represents
essentially zero separation. It is seen that the accuracy with which one
factor can be discriminated from multiple factors varies directly with sample
size and number of items, and inversely with the number of multiple factors,
degree of obliqueness of the factors, and the dispersion of item difficulties.
These characteristics are all familiar; they are characteristics of every
index we have tried, although there is variation in sensitivity to levels of
the five parameters from one index to another. Unfortunately, Simplex Fitting
is more adversely affected by the dispersion of item difficulties than the
others based on the simplex model. Given our conception of item pools in
which an accurate decision concerning dimensionality is required, the accuracy
of Simplex Fitting breaks down where it is needed most.
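
The separation statistic itself is defined in the second technical report and is not restated here. One reading that is consistent with the stated endpoints (200 for no overlap, about 100 for no separation) is the sum of the two percentages correctly classified at the best single cutting point. The sketch below implements that reading only; it is an assumption, as are the function name and the convention that the one factor samples have the smaller index values.

```python
def separation(one_factor_values, multi_factor_values):
    """Best achievable sum of percent-correct in the two distributions.

    ASSUMPTION: a one-factor sample is 'correct' when its index value is at
    or below the cut, a multiple-factor sample when it is above the cut.
    """
    best = 100.0  # a cut outside the data classifies one group perfectly, the other not at all
    for cut in sorted(set(one_factor_values) | set(multi_factor_values)):
        pct_one = 100.0 * sum(v <= cut for v in one_factor_values) / len(one_factor_values)
        pct_multi = 100.0 * sum(v > cut for v in multi_factor_values) / len(multi_factor_values)
        best = max(best, pct_one + pct_multi)
    return best
```

Under this reading, two identical sampling distributions give 100 and two non-overlapping ones give 200, which matches the description of the scale but does not establish that this is the computation used for the tables.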

Table 3

Discrimination of One versus Two through Five Factors by the Simplex Fitting Index
as a Function of Factor Obliqueness, Item Difficulty Distribution, and Sample Size*

20 Items

                          Low               Intermediate             High
   N   Factors     Nar   Med  Wide       Nar   Med  Wide       Nar   Med  Wide

 125   1 vs. 2     186   182   141       151   145   115       110   109   101
       1 vs. 3     187   181   136       158   146   113       118   110   105
       1 vs. 4     189   175   135       156   146   122       120   108   102
       1 vs. 5     186   176   128       149   143   117       116   111   103

 500   1 vs. 2     200   195   162       198   185   126       187   172   110
       1 vs. 3     200   197   173       200   189   123       186   167   103
       1 vs. 4     200   196   164       200   185   117       185   163   106
       1 vs. 5     200   194   154       195   182   115       178   164   106

2000   1 vs. 2     200   200   182       200   198   151       200   194   129
       1 vs. 3     200   199   181       200   199   148       200   192   126
       1 vs. 4     200   199   177       200   197   148       200   191   115
       1 vs. 5     200   199   170       200   196   139       200   193   113

* From left to right in the table, the first column contains sample size, the second
number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties are
then repeated for intermediate and high obliqueness.

Table 4

Discrimination of One versus Two through Five Factors by the Simplex Fitting Index
as a Function of Factor Obliqueness, Item Difficulty Distribution, and Sample Size*

30 Items

                          Low               Intermediate             High
   N   Factors     Nar   Med  Wide       Nar   Med  Wide       Nar   Med  Wide

 125   1 vs. 2     188   188   147       165   154   123       122   113   103
       1 vs. 3     189   189   149       163   152   127       123   110   101
       1 vs. 4     188   188   139       158   138   122       113   104   105
       1 vs. 5     182   182   135       157   141   117       110   110   101

 500   1 vs. 2     200   199   181       200   189   140       197   170   115
       1 vs. 3     200   199   178       200   193   132       195   174   107
       1 vs. 4     200   199   171       200   188   129       192   156   105
       1 vs. 5     200   199   168       200   182   120       186   150   103

2000   1 vs. 2     200   200   194       200   199   162       200   197   125
       1 vs. 3     200   200   192       200   199   168       200   196   117
       1 vs. 4     200   200   191       200   200   149       200   197   114
       1 vs. 5     200   200   180       200   200   132       200   194   109

* From left to right in the table, the first column contains sample size, the second
number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties are
then repeated for intermediate and high obliqueness.

Table 5 presents means and standard deviations of the sums of squared
deviations of observed from expected for certain favorable combinations of
parameters in the trimmed 20x20 and 30x30 matrices. Simplex Fitting can in
many instances separate the sampling distributions of one factor and multiple
factors very well indeed. The index is a good deal more variable in multiple
factor R-matrices than in the unidimensional ones. It is also necessary to
realize that the index is not normally distributed.

Table 6 includes parallel data for a selection of unfavorable
combinations of parameters. The simplex model tends to fit correlations
produced by highly oblique multiple factors among items varying widely in
difficulty levels as well (or as poorly) as it fits R-matrices in which there
is only one factor among widely distributed items.

A Possible Modification of the Methodology

We have considered whether the model fitting could be improved by fixing
appropriately the true score correlations between items adjacent in
difficulty. In a perfect Guttman scale the proportion passing the more
difficult item is also the proportion passing both items. One could start
with a model that fixed all such correlations at their maximum level for a
unidimensional test. Although we have not tried this approach, it appears
from inspection of the distributions of estimated adjacent item correlations
in which only end reliabilities were fixed that improved accuracy would
result; that is, the distributions of estimated adjacent item correlations
have means and variances that also distinguish between one and multiple
factors.
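
The Guttman constraint mentioned above fixes the product-moment correlation of two binary items once their difficulties are known: with the proportion passing both items equal to the proportion passing the harder one, phi reduces to sqrt(p_hard q_easy / (p_easy q_hard)). The sketch below computes that value from the usual phi formula. It illustrates the constraint only; how such a maximum would be mapped onto the true score correlations of the fitted model is not specified in the text, and the function name is an assumption.

```python
import math


def guttman_phi(p_easy, p_hard):
    """Phi coefficient of two items forming a perfect Guttman pair.

    p_easy, p_hard: proportions passing the easier and the harder item.
    """
    if not 0.0 < p_hard <= p_easy < 1.0:
        raise ValueError("require 0 < p_hard <= p_easy < 1")
    p_both = p_hard  # everyone who passes the harder item also passes the easier
    covariance = p_both - p_easy * p_hard
    return covariance / math.sqrt(p_easy * (1 - p_easy) * p_hard * (1 - p_hard))
```

For example, items passed by 80 and 50 percent of examinees have a maximum phi of about .50, which shows how quickly the ceiling drops as difficulties diverge.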



Table 5

Means and Standard Deviations of the Simplex Fitting Index in
Narrow Distributions of Item Difficulties and the Larger Samples

                         Low            Intermediate            High
   N   Factors      Mean     SD        Mean     SD         Mean     SD

20 Items

 500      1         .180   .025        .180   .025         .180   .025
          2        1.080   .413        .556   .230         .312   .084
          3         .913   .317        .478   .122         .289   .067
          4         .814   .216        .421   .096         .280   .063
          5         .773   .239        .420   .123         .268   .062

2000      1         .044   .007        .044   .007         .044   .007
          2         .735   .312        .357   .151         .196   .070
          3         .669   .194        .363   .123         .168   .061
          4         .606   .161        .276   .087         .149   .045
          5         .561   .188        .281   .104         .141   .051

30 Items

 500      1         .466   .035        .466   .035         .466   .035
          2        2.532   .782       1.302   .393         .809   .207
          3        2.266   .550       1.177   .294         .757   .149
          4        1.742   .434       1.017   .206         .683   .109
          5        1.659   .402        .953   .184         .632   .090

2000      1         .118   .011        .118   .011         .118   .011
          2        1.803   .650       1.021   .319         .509   .146
          3        1.625   .434        .834   .254         .416   .110
          4        1.347   .332        .665   .184         .343   .078
          5        1.116   .263        .539   .135         .307   .067

Table 6

Means and Standard Deviations of the Simplex Fitting Index in
Wide Distributions of Item Difficulties and the Larger Factor Correlations

                                N = 125            N = 500            N = 2000
Obliqueness   Factors       Mean     SD        Mean     SD        Mean     SD

20 Items

Intermediate     1         1.457   .591        .447   .205        .173   .109
                 2         1.458   .470        .523   .196        .280   .101
                 3         1.470   .495        .499   .222        .252   .085
                 4         1.534   .577        .444   .140        .236   .079
                 5         1.442   .339        .457   .187        .215   .086

High             1         1.457   .591        .447   .205        .173   .109
                 2         1.302   .398        .388   .127        .184   .068
                 3         1.344   .395        .396   .197        .161   .052
                 4         1.407   .541        .393   .133        .152   .057
                 5         1.410   .794        .383   .154        .154   .071

30 Items

Intermediate     1         3.329  1.102        .978   .307        .392   .172
                 2         3.708  1.730       1.236   .292        .681   .291
                 3         3.363   .532       1.162   .277        .645   .154
                 4         3.336   .708       1.055   .186        .510   .100
                 5         3.148   .682        .990   .192        .462   .094

High             1         3.329  1.102        .978   .307        .392   .172
                 2         3.165   .860        .977   .275        .442   .111
                 3         3.100   .852        .886   .246        .391   .117
                 4         3.040   .641        .868   .235        .360   .101
                 5         2.894   .606        .847   .205        .318   .105

Unfortunately, it also looks as if the greatest improvement in accuracy would
occur for combinations of parameters where the index is already highly accurate.

In high obliqueness, wide distribution of item difficulties, and five factors, the
adjacent item correlations estimated by the present methodology tend to converge on
values being obtained for matrices based on a single factor. Perhaps, however,
simplex fitting with adjacent item correlations fixed as described above should be
tried systematically. If a maximum likelihood criterion of fitting were used, a
chi square test of goodness of fit would be available. This test, however, when
applied to samples from a model which is unidimensional except for measurement
error, would vary as a function of dispersion of item difficulties. Guessing
contributes more error variance in such data and is negatively correlated with
total score.

Trying to Obtain Something for Nothing

Reduction in Spread of Item Difficulties

Because the adverse effects of wide distributions of item difficulty were so
severe, we decided to try a method that would sacrifice power along certain
dimensions for a possible gain from reducing the dispersion of item difficulties.
We selected the higher two-thirds of the distribution of test scores and scored
this sample on the more difficult two-thirds of the items, and repeated the
procedure on the lower two-thirds of the scores and the easier two-thirds of the
items. The overlapping one-third of the scores and items from the middle of the
two distributions was designed to allow a test of the hypothesis that the two sets
of items were measuring the same factor or factors. We sacrificed sample size and
level of item intercorrelations (from restriction in range of talent) for a
reduction in spread of item difficulties.
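
The selection procedure just described can be sketched as follows. The function name, the data layout (one 0/1 row per examinee), and the handling of tied scores are assumptions made for illustration; this is not the program used in the project.

```python
def split_two_thirds(responses):
    """Return (high scorers x harder items, low scorers x easier items).

    responses: list of 0/1 lists, one row per examinee (hypothetical layout).
    Each subsample keeps two-thirds of the people and two-thirds of the items,
    so the middle third of both is shared by the two subsamples.
    """
    n_people, n_items = len(responses), len(responses[0])
    keep_people = (2 * n_people) // 3
    keep_items = (2 * n_items) // 3

    # Proportion passing each item; a low p-value marks a difficult item.
    p_values = [sum(row[c] for row in responses) / n_people for c in range(n_items)]

    by_score = sorted(range(n_people), key=lambda r: sum(responses[r]))  # lowest first
    by_difficulty = sorted(range(n_items), key=lambda c: p_values[c])    # hardest first

    low_people, high_people = by_score[:keep_people], by_score[-keep_people:]
    hard_items, easy_items = by_difficulty[:keep_items], by_difficulty[-keep_items:]

    def submatrix(people, items):
        return [[responses[r][c] for c in items] for r in people]

    return submatrix(high_people, hard_items), submatrix(low_people, easy_items)
```

With 500 examinees and 30 items this yields two subsamples of 333 cases scored on 20 items each. Ties in total score are broken by the original row order here, which is an arbitrary choice.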

This procedure was tried out as described on a small scale for a highly
unfavorable combination of parameters that included high oblique factors, a wide
range of item difficulties, one, two, and five factors, 30 items, and an N of 500.
Thus two samples of approximately 333 cases each were scored on different but
overlapping 20 item tests. We computed only two indices of dimensionality: Local
Independence, our most accurate index in the research to date, and Simplex
Fitting.²

Results

Table 7 contains a comparison of accuracy in both 20 and 30 items as
heretofore computed with accuracy in each of the two 20 item tests selected by the
procedure described from one 30 item test. One and five factors are represented by
100 replications, two factors by 50. In the case of Local Independence, there is a
reduction in accuracy of discrimination for the modified index in comparison with
its use in the standard manner in 30 items and 500 cases. Approximately equivalent
accuracy is obtained when the comparison is with the standard computations for 20
items in 500 cases. Apparently the decrease in size of item intercorrelations in
the narrower range of talent was a greater handicap than any gain associated with
the decrease in dispersion of item difficulties.

The data for the Simplex Fitting Index are not as easy to interpret.
Programming for this task was divided between two persons who had no opportunity
for face-to-face interaction. A few changes in the program were required. These
changes produced a higher incidence of extreme outliers. A second difference, a
failure to exclude residuals from the easiest and most difficult items, we were
able to allow for in obtaining means and standard deviations in this table. In
spite of the differences in outcomes for the two sets of data from the 30 Standard
condition, it appears that the manipulation of items and samples produced a small
increase in accuracy of diagnosis of dimensionality. We attribute this outcome to
the high degree of sensitivity of this index to wide distributions of item
difficulties. Perhaps a further gain in accuracy could be obtained from fixing the
true score correlations as described earlier.

Table 7

Comparison of Two Indices Computed in the Standard
Manner with Their Counterparts after Selecting 20 Extreme Items in
Each Direction from a 30 Item Test

                                          Factors
                              1               2               5
                         Mean    SD      Mean    SD      Mean    SD

Local Independence Index
   30 Standard           .705   .030     .455   .094     .680   .037
   20/30 Difficult       .614   .053     .403   .105     .617   .053
   20/30 Easy            .627   .058     .375   .121     .607   .057
   20 Standard(a)        .697   .039     .545   .093     .658   .047
   30 Standard(a)        .711   .028     .507   .089     .663   .042

Simplex Fitting Index(b)
   30 Standard(c)        .669   .473    1.238   .467    1.073  1.640
   20/30 Difficult(c)    .908   .314    1.618   .490    1.306   .506
   20/30 Easy(c)         .996   .398    1.538   .532    1.268   .532
   20 Standard(a)        .503   .212     .460   .153     .435   .160
   30 Standard(a)        .467   .139     .467   .121     .408   .093

(a) From earlier computer runs involving different samples of items and examinees.

(b) The sum of squared deviations for 30 items was adjusted to the equivalent of a 20
item test.

(c) The earlier computer program incorporated a method for avoiding most extreme
outliers in the Simplex Fitting Index. Note the evidence for extreme skewness
for this index in the present computations.

What We Have Learned

The Recommended Index

The most accurate index available to an investigator, with the possible
exception of full information factor analysis, is the one we have called Local
Independence. Tables showing the accuracy of this index for all combinations
of the parameters we varied appeared in our second technical report and are
reprinted here as Tables 8-12. Although this index can be computed
inexpensively, there is no simple method of using it to make a decision
concerning dimensionality in a set of binary items. We have no formula that
allows estimation of means and standard deviations from combinations of
parameters using settings that vary from the ones we used. We do believe that
our selections come close to defining the limits of the domain. To support
use of the index we report in Appendix A the means and standard deviations for
all combinations of parameters. In contrast, we included only a partial set
in the second technical report.
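
The report gives no decision rule, so the following is only a minimal sketch of one
informal way an investigator might read the Appendix A entries: express an observed
index as a standardized distance from each tabled mean. The tabled values are taken
from Table A-2 (30 items, N of 500, high obliqueness, wide difficulties); the
observed value of .650 and the function names are hypothetical.

```python
# Tabled (mean, SD) pairs from Table A-2: 30 items, N = 500, high obliqueness,
# wide spread of item difficulties. Decimal points are restored here.
TABLED = {
    1: (0.711, 0.028),  # one factor
    2: (0.507, 0.089),  # two factors
    5: (0.663, 0.042),  # five factors
}


def standardized_distance(observed, mean, sd):
    """Distance of an observed index from a tabled mean, in tabled SD units."""
    return (observed - mean) / sd


def compare_to_table(observed, tabled=TABLED):
    """Return {number of factors: standardized distance} for every tabled entry."""
    return {k: standardized_distance(observed, m, s) for k, (m, s) in tabled.items()}


if __name__ == "__main__":
    # 0.650 is a made-up observed value, used only for illustration.
    for n_factors, z in compare_to_table(0.650).items():
        print(f"{n_factors} factor(s): z = {z:+.2f}")
```

Such distances are descriptive only. They do not replace the sampling distributions
themselves, and they apply only when the test and sample resemble a tabled
combination of parameters.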

Computation of the Local Independence Index

We also repeat here a step by step review of the methodology used in
obtaining this index.

1. Compute separate variance-covariance matrices for each sample of
persons who have the same total score on the test. Note that tetrachoric
correlations are not involved.

2. Obtain an aggregate variance-covariance matrix by weighting each
separate matrix by the size of its sample.

3. Change signs of the aggregate matrix by row and column until a maximum
algebraic total of the matrix, excluding its principal diagonal, is obtained.
This step is accomplished by a routine parallel in every way to the one used
in centroid factor analysis.

4. The algebraic sum is now subtracted from the absolute sum of the
residuals. This is the Local Independence index.
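
The four steps can be sketched in Python roughly as follows. This is an
illustration under stated assumptions, not the authors' program: the function names
are hypothetical, each within-group covariance uses a divisor of n (so that
single-person groups contribute zeros), and the sign-change routine is a greedy
reflection that stops at a local maximum, as centroid-style reflection does. The
sketch returns the raw difference named in step 4; any rescaling applied to the
values tabled in this report is not specified here.

```python
import numpy as np


def aggregate_conditional_covariance(responses):
    """Steps 1-2: covariance matrix within each total-score group, pooled by group size."""
    responses = np.asarray(responses, dtype=float)
    n_persons, n_items = responses.shape
    totals = responses.sum(axis=1)
    pooled = np.zeros((n_items, n_items))
    for score in np.unique(totals):
        group = responses[totals == score]
        dev = group - group.mean(axis=0)
        cov = dev.T @ dev / len(group)  # divisor n is an assumption of this sketch
        pooled += len(group) * cov      # weight by the size of the group
    return pooled / n_persons


def maximize_algebraic_sum(matrix):
    """Step 3: flip signs by row and column until no off-diagonal column sum is negative."""
    m = np.array(matrix, dtype=float)
    np.fill_diagonal(m, 0.0)            # the principal diagonal is excluded
    while True:
        col_sums = m.sum(axis=0)
        k = int(np.argmin(col_sums))
        if col_sums[k] >= -1e-12:       # no reflection can raise the total further
            return m
        m[k, :] *= -1.0                 # reflect variable k: its row ...
        m[:, k] *= -1.0                 # ... and its column


def local_independence_index(responses):
    """Step 4: absolute sum minus the maximized algebraic sum of the off-diagonal entries."""
    reflected = maximize_algebraic_sum(aggregate_conditional_covariance(responses))
    return np.abs(reflected).sum() - reflected.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    binary_items = (rng.random((500, 20)) < 0.6).astype(int)  # simulated 0/1 responses
    print(local_independence_index(binary_items))
```

Each reflection of a variable with a negative column sum strictly raises the
algebraic total, so the loop always terminates.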

Table 8

Discrimination of One Versus Two through Five Factors
by the Local Independence Index as a Function of
Factor Obliqueness, Item Difficulty, and Sample Size

20 Items

                         LOW               INTERMEDIATE                HIGH
    N  Factors   Narrow Medium  Wide    Narrow Medium  Wide    Narrow Medium  Wide

  125  1 v. 2      186    186    174      173    169    145      149    132    121
       1 v. 3      181    174    158      149    143    133      127    116    118
       1 v. 4      163    162    148      141    135    133      126    113    110
       1 v. 5      160    164    152      139    132    132      121    108    108

  500  1 v. 2      200    200    200      200    199    193      198    194    182
       1 v. 3      199    198    197      191    195    183      190    183    159
       1 v. 4      196    195    193      188    192    173      176    172    145
       1 v. 5      193    197    183      175    181    155      166    154    136

 2000  1 v. 2      200    200    200      200    200    197      200    200    195
       1 v. 3      200    200    200      200    200    196      200    197    191
       1 v. 4      200    200    198      199    200    194      199    199    181
       1 v. 5      200    200    196      199    200    189      199    195    180

Note: 200 indicates total independence of distributions.
100 indicates total overlap.

Table 9

Discrimination of One Versus Two through Five Factors
by the Local Independence Index as a Function of
Factor Obliqueness, Item Difficulty, and Sample Size

30 Items

                         LOW               INTERMEDIATE                HIGH
    N  Factors   Narrow Medium  Wide    Narrow Medium  Wide    Narrow Medium  Wide

  125  1 v. 2      197    196    190      188    178    165      164    151    131
       1 v. 3      192    186    182      178    160    158      146    123    124
       1 v. 4      185    181    177      156    144    136      131    119    117
       1 v. 5      173    172    169      149    135    145      125    115    107

  500  1 v. 2      200    200    200      200    200    200      199    200    197
       1 v. 3      200    200    200      196    200    195      195    188    177
       1 v. 4      198    199    199      194    195    188      187    195    157
       1 v. 5      195    199    197      192    190    176      170    171    154

 2000  1 v. 2      200    200    200      200    200    200      200    200    200
       1 v. 3      200    200    200      200    200    200      200    200    196
       1 v. 4      200    200    199      199    200    199      200    200    192
       1 v. 5      200    200    200      200    200    196      199    200    189

Note: 200 indicates total independence of distributions.
100 indicates total overlap.


Table 10

Discrimination of One Versus Two through Five Factors
by the Local Independence Index as a Function of
Factor Obliqueness, Item Difficulty, and Sample Size

40 Items

                         LOW               INTERMEDIATE                HIGH
    N  Factors   Narrow Medium  Wide    Narrow Medium  Wide    Narrow Medium  Wide

  125  1 v. 2      197    199    196      195    189    181      175    168    140
       1 v. 3      194    191    187      184    176    165      147    148    123
       1 v. 4      188    183    179      161    164    139      136    142    112
       1 v. 5      177    187    169      156    154    123      128    125    111

  500  1 v. 2      200    200    200      200    200    199      200    199    200
       1 v. 3      200    200    200      200    199    197      198    196    191
       1 v. 4      200    200    200      200    196    196      194    192    167
       1 v. 5      199    197    199      197    197    182      187    172    154

 2000  1 v. 2      200    200    200      200    200    200      200    200    200
       1 v. 3      200    200    200      200    200    200      200    200    199
       1 v. 4      200    200    200      200    200    199      200    200    198
       1 v. 5      200    200    200      200    200    199      199    200    196

Note: 200 indicates total independence of distributions.
100 indicates total overlap.

Table 11

Discrimination of One Versus Two through Five Factors
by the Local Independence Index as a Function of
Factor Obliqueness, Item Difficulty, and Sample Size

50 Items

                         LOW               INTERMEDIATE                HIGH
    N  Factors   Narrow Medium  Wide    Narrow Medium  Wide    Narrow Medium  Wide

  125  1 v. 2      200    200    198      196    191    174      176    169    149
       1 v. 3      200    197    187      187    184    162      162    149    131
       1 v. 4      197    191    183      174    168    138      152    133    113
       1 v. 5      187    184    177      170    151    134      141    124    106

  500  1 v. 2      200    200    200      200    200    200      200    200    200
       1 v. 3      200    200    200      200    200    199      199    200    192
       1 v. 4      200    200    199      199    200    198      192    195    185
       1 v. 5      199    200    198      199    197    191      185    186    174

 2000  1 v. 2      200    200    200      200    200    200      200    200    200
       1 v. 3      200    200    200      200    200    200      200    200    200
       1 v. 4      200    200    200      200    200    200      200    200    200
       1 v. 5      200    200    200      200    200    200      200    200    200

Note: 200 indicates total independence of distributions.
100 indicates total overlap.

Table 12

Discrimination of One Versus Two through Five Factors
by the Local Independence Index as a Function of
Factor Obliqueness, Item Difficulty, and Sample Size

60 Items

                         LOW               INTERMEDIATE                HIGH
    N  Factors   Narrow Medium  Wide    Narrow Medium  Wide    Narrow Medium  Wide

  125  1 v. 2      200    199    199      198    195    182      189    176    160
       1 v. 3      198    200    196      191    192    172      171    152    133
       1 v. 4      195    195    187      184    175    156      162    137    118
       1 v. 5      191    193    182      176    155    140      146    123    114

  500  1 v. 2      200    200    200      200    200    200      200    200    200
       1 v. 3      200    200    200      200    200    200      200    200    198
       1 v. 4      200    200    200      200    200    199      199    199    187
       1 v. 5      199    200    200      200    199    197      192    190    176

 2000  1 v. 2      200    200    200      200    200    200      200    200    200
       1 v. 3      200    200    200      200    200    200      200    200    200
       1 v. 4      200    200    200      200    200    200      200    200    200
       1 v. 5      200    200    200      200    200    200      200    200    199

Note: 200 indicates total independence of distributions.
100 indicates total overlap.


If the difference between the absolute and algebraic sums is small, there is
order in the aggregate matrix arising from more than one factor in the covariances;
that is, the items of persons having the same levels of estimated ability are not
locally independent. Given local independence, however, there is no order in that
matrix, and the difference between the absolute and algebraic sums is large.

Properties of the Index

It was not surprising that Local Independence followed the pattern expected of
sample statistics and became more accurate with increases in sample size. The
number of items in the pool was also an important parameter, but its contribution
to accuracy is presumably by way of the number of items per factor. The latter is
a possible parameter that was not varied independently of number of items. A test
constructor starts with a conception of a test and writes multiple items. If the
conception produces multiple factors, the difference in number of factors in pools
of 20 and 60 items based on the same conception should be trivial.

It was also not surprising that the number of factors in the model was
negatively related to the degree of separation of the sampling distributions of one
and multiple factors. As described earlier, when multiple factors are
substantially oblique the total score becomes an increasingly accurate measure
of the second-order factor defined by the multiple first-order factors.

The parameters of distribution of item difficulties and of factor
intercorrelations are almost impossible to handle in any generalizable way. We
arbitrarily defined three levels of each. These levels are briefly described here
as well as in our second technical report. Factor intercorrelations: low .35,
intermediate .55, and high .70, each on average. Item difficulty distribution
means and standard deviations, in that order, in normal deviate units:
narrow -.13 and .32, medium .00 and .50, and wide .10 and .80. Item difficulty
distribution means and standard deviations in proportion correct following the
effects of guessing: narrow .615 and .113, medium .560 and .155, and wide .567 and
.215.

Local Independence shows less separation of one from multiple factors as both
factor obliqueness and dispersion of item difficulties increase, but it is a
well-behaved index, lacking complex interactions among the five parameters. The
combination of 60 items in a sample of 2,000 allowed this index to achieve
virtually complete separation of one from multiple factors under the most
unfavorable combination of the other parameters.

Indices Dependent on Tetrachoric Correlations

We have also learned that relationships among successive Eigenvalues of
R-matrices composed of tetrachoric correlations having either unities or estimated
communalities down the principal diagonal are a highly undependable guide to the
dimensionality of binary items. The information about dimensionality obtained from
the Eigenvalues of tetrachorics is only a little more accurate than that obtained
from product-moment correlations (phis) and is less accurate, within the limits of
the values of the parameters we used, than the application of the same index to the
Eigenvalues of the variance-covariance matrix. Data to support these assertions
were reported in technical reports 1 and 2 and are not repeated here.

Of course, indices based on tetrachorics do improve in accuracy as sample size
increases. It is possible that the accuracy of indices involving tetrachorics
might become competitive with others if sample size were increased substantially
above our maximum value of 2,000, but this can hardly be recommended to test
constructors. The heart of the problem is the sampling error of a tetrachoric
correlation, which is a function of sample size, size of the population
correlation, and the two item difficulties. If, for example, the correlation is
between an easy item (p = .9) and a difficult item (p = .1), a shift of one case in
a hundred changes a correlation of +1.0 to .0.
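
The arithmetic behind this example can be checked with a small sketch. The two
fourfold tables below are constructed for illustration, assuming N = 100 and
margins held at .9 and .1; the function names are hypothetical. The sketch does not
estimate the tetrachoric itself. It relies on two standard facts: an empty
off-diagonal cell sends the uncorrected tetrachoric estimate to +1.0, and the
tetrachoric is zero exactly when the cross products ad and bc are equal.

```python
import math


def cross_product_difference(a, b, c, d):
    """ad - bc for a fourfold table; zero means the two items are independent."""
    return a * d - b * c


def phi_coefficient(a, b, c, d):
    """Product-moment (phi) correlation for the same table, for comparison."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return cross_product_difference(a, b, c, d) / denominator


# Cell order: a = pass both items, b = pass easy item only,
#             c = pass difficult item only, d = fail both items.
# Margins in both tables: 90 of 100 pass the easy item, 10 of 100 pass the difficult one.
before = (10, 80, 0, 10)  # nobody passes the difficult item while failing the easy one
after = (9, 81, 1, 9)     # one case now falls in that cell, margins unchanged

if __name__ == "__main__":
    for label, table in (("before", before), ("after", after)):
        print(label, cross_product_difference(*table), round(phi_coefficient(*table), 3))
```

In the first table the empty cell puts the tetrachoric at its upper bound, while in
the second ad = bc = 81, so the tetrachoric is zero. The phi coefficient moves much
less over the same change, from about .11 to .00.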

More generally, we have not found any way of manipulating Eigenvalues that is
competitive in accuracy with indices in other categories over the full range of
combinations of our parameters. The best that can be said for indices in this
category is that they are not as sensitive to decreases in sample size and number
of items as the rest.

Indices Based on Simplex Theory

Both the Pattern Indices and Simplex Fitting are intermediate in overall
accuracy between Local Independence and the Ratio of Differences Indices. The
former two, however, performed in a disappointing fashion under combinations of
high levels of factor intercorrelations and a wide dispersion of item difficulties.
There also seems to be an interaction of these parameters with sample size in the
direction of little or no gain in accuracy with increase in N in unfavorable
combinations.

Early in our research, when we were using only the low and intermediate levels of
factor intercorrelations, the Pattern Indices were found to be less sensitive to
dispersion of item difficulties than Local Independence. Extrapolation to larger
Ns and more items than we had used up to that point in time suggested more promise
in these approaches than was eventually realized. Apparently the second-order
factor becomes more and more evident in the correlations under the conditions
described. The Local Independence index, however, is able to find order in the
multiple factor situation in spite of the simplex appearance of the R-matrix.

Footnotes

The authors wish to express their thanks to Dr. Timothy Davey for his help
in programming the Simplex Fitting Index.

Mr. Gary Thomasson made a major contribution to the completion of this task.

Table A-1

Means and Standard Deviations of the Local Independence Index
for All Combinations of Parameters*

20 Items

                         Low               Intermediate                High
    N  Factors     Nar    Med   Wide      Nar    Med   Wide      Nar    Med   Wide

  125     1        580    579    554      580    579    554      580    579    554
                   055    053    064      055    053    064      055    053    064
          2        301    338    357      422    433    475      501    527    533
                   118    117    121      105    113    092      081    087    077
          3        377    420    432      484    509    500      543    561    543
                   112    099    095      106    082    079      073    064    069
          4        450    459    460      533    531    511      554    564    550
                   099    091    094      062    069    074      060    064    061
          5        479    460    459      532    540    509      565    576    555
                   085    095    090      069    059    076      047    061    055

  500     1        699    707    697      699    707    697      699    707    697
                   039    034    039      039    034    039      039    034    039
          2        189    260    331      298    370    447      434    472    545
                   077    085    100      105    099    116      095    100    093
          3        383    388    454      450    486    535      535    578    611
                   091    102    082      104    093    088      077    076    063
          4        478    497    519      535    546    586      603    613    642
                   077    080    072      081    071    067      068    067    062
          5        541    541    567      587    600    632      625    648    658
                   057    060    065      060    054    052      050    047    047

 2000     1        818    819    805      818    819    805      818    819    805
                   021    021    032      021    021    032      021    021    032
          2        169    218    298      274    319    436      376    443    545
                   077    076    105      101    115    111      106    107    096
          3        374    415    456      398    471    557      523    589    643
                   110    102    095      098    088    086      105    090    072
          4        471    501    532      527    575    608      621    649    703
                   085    084    078      081    074    093      073    060    063
          5        542    574    606      599    627    674      665    694    725
                   067    064    060      054    059    047      047    042    045

* From left to right in the table, the first column contains sample size, the
second number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties
are then repeated for intermediate and high obliqueness. The first row opposite
the factor number contains means, the second the standard deviations. Both are
reported to three significant decimal places, but decimal points are omitted to
reduce crowding.


Table A-2

Means and Standard Deviations of the Local Independence Index
for All Combinations of Parameters*

30 Items

                         Low               Intermediate                High
    N  Factors     Nar    Med   Wide      Nar    Med   Wide      Nar    Med   Wide

  125     1        628    623    602      628    623    602      628    623    602
                   040    045    043      040    045    043      040    045    043
          2        302    320    379      417    451    487      516    542    567
                   103    105    098      101    098    092      088    080    070
          3        414    426    442      496    525    518      572    599    582
                   094    088    086      078    079    081      065    055    056
          4        482    482    473      548    562    565      596    604    591
                   073    072    077      066    070    060      061    044    045
          5        510    510    493      573    583    558      606    609    603
                   068    067    077      058    057    054      042    042    045

  500     1        706    710    711      706    710    711      706    710    711
                   028    028    028      028    028    028      028    028    028
          2        156    193    270      263    303    389      375    438    507
                   062    079    091      085    093    083      101    083    089
          3        364    388    409      447    454    510      521    548    603
                   095    090    089      089    081    079      083    064    063
          4        464    463    490      530    546    568      580    617    652
                   075    079    075      069    065    077      061    067    052
          5        503    529    541      559    590    618      628    642    663
                   071    065    067      064    053    057      050    044    042

 2000     1        803    808    797      803    808    797      803    808    797
                   018    017    025      018    017    025      018    017    025
          2        113    135    208      164    217    332      270    333    440
                   055    067    077      061    079    108      083    088    091
          3        348    367    430      386    427    472      474    506    593
                   116    099    101      086    104    086      089    075    079
          4        450    451    514      481    503    572      550    584    655
                   091    081    084      077    070    079      076    059    059
          5        497    524    563      550    582    631      608    650    696
                   065    072    072      065    067    056      061    051    045

* From left to right in the table, the first column contains sample size, the
second number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties
are then repeated for intermediate and high obliqueness. The first row opposite
the factor number contains means, the second the standard deviations. Both are
reported to three significant decimal places, but decimal points are omitted to
reduce crowding.

Table A-3

Means and Standard Deviations of the Local Independence Index
for All Combinations of Parameters*

40 Items

                         Low               Intermediate                High
    N  Factors     Nar    Med   Wide      Nar    Med   Wide      Nar    Med   Wide

  125     1        655    654    628      655    654    628      655    654    628
                   037    033    040      037    033    040      037    033    040
          2        327    348    375      426    469    485      533    555    583
                   096    098    092      099    099    083      072    072    062
          3        428    459    464      511    546    539      594    608    605
                   088    096    078      076    069    062      058    052    048
          4        488    505    504      575    582    583      616    626    620
                   072    085    074      053    058    049      050    037    044
          5        546    534    532      592    602    600      628    633    623
                   056    059    063      049    047    051      045    037    035

  500     1        723    724    718      723    724    718      723    724    718
                   023    024    024      023    024    024      023    024    024
          2        150    171    256      230    291    369      340    401    477
                   056    061    081      065    087    089      079    094    074
          3        371    378    419      424    463    496      501    541    585
                   098    091    087      087    082    080      075    077    067
          4        470    458    505      504    536    574      581    607    647
                   074    069    062      072    071    055      067    050    047
          5        516    533    548      559    586    614      631    652    670
                   062    063    056      058    055    051      050    051    036

 2000     1        797    802    797      797    802    797      797    802    797
                   016    015    022      016    015    022      016    015    022
          2        083    101    162      133    161    256      226    282    372
                   035    042    056      052    060    086      072    081    088
          3        357    360    383      385    386    461      422    475    540
                   116    100    084      100    099    093      078    083    074
          4        434    440    466      470    497    552      539    563    616
                   088    081    072      077    078    068      061    060    055
          5        490    507    543      534    548    599      598    627    667
                   073    079    068      067    065    068      065    047    053

* From left to right in the table, the first column contains sample size, the
second number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties
are then repeated for intermediate and high obliqueness. The first row opposite
the factor number contains means, the second the standard deviations. Both are
reported to three significant decimal places, but decimal points are omitted to
reduce crowding.


Table A-4

Means and Standard Deviations of the Local Independence Index
for All Combinations of Parameters*

50 Items

                         Low               Intermediate                High
    N  Factors     Nar    Med   Wide      Nar    Med   Wide      Nar    Med   Wide

  125     1        681    672    646      681    672    646      681    672    646
                   028    029    036      028    029    036      028    029    036
          2        311    333    395      437    475    509      554    576    588
                   099    085    090      097    097    092      080    066    060
          3        441    453    483      532    553    558      606    628    616
                   074    087    079      071    065    065      067    046    044
          4        508    526    525      590    597    603      631    644    635
                   064    068    063      057    055    051      044    039    039
          5        542    553    538      615    621    610      650    654    643
                   067    064    058      048    049    047      037    036    038

  500     1        734    737    733      734    737    733      734    737    733
                   023    022    020      023    022    020      023    022    020
          2        048    164    233      230    277    345      353    385    470
                   247    047    061      070    081    085      076    066    072
          3        380    389    431      429    450    507      514    539    592
                   097    081    081      076    075    077      073    061    061
          4        458    470    492      513    540    575      598    617    647
                   072    074    068      070    062    053      059    048    044
          5        513    528    553      577    591    626      636    649    681
                   064    061    064      053    052    051      054    041    034

 2000     1        798    801    795      798    801    795      798    801    795
                   014    015    017      014    015    017      014    015    017
          2        067    089    139      122    142    219      195    236    322
                   030    032    048      048    050    069      058    065    080
          3        359    363    390      371    409    447      427    458    520
                   090    091    103      102    098    089      091    077    070
          4        439    434    473      473    479    532      534    543    602
                   088    078    076      078    064    068      068    062    060
          5        494    497    522      528    551    576      584    617    657
                   068    066    060      061    062    059      050    049    045

* From left to right in the table, the first column contains sample size, the
second number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties
are then repeated for intermediate and high obliqueness. The first row opposite
the factor number contains means, the second the standard deviations. Both are
reported to three significant decimal places, but decimal points are omitted to
reduce crowding.

Table A-5

Means and Standard Deviations of the Local Independence Index
for All Combinations of Parameters*

60 Items

                         Low               Intermediate                High
    N  Factors     Nar    Med   Wide      Nar    Med   Wide      Nar    Med   Wide

  125     1        693    690    664      693    690    664      693    690    664
                   024    024    030      024    024    030      024    024    030
          2        326    362    377      449    488    517      555    588    594
                   086    093    103      084    091    083      075    069    062
          3        467    472    467      552    561    577      619    637    631
                   075    073    073      072    058    059      057    048    051
          4        522    523    525      596    612    606      645    661    653
                   070    067    064      053    052    050      042    036    037
          5        565    569    551      622    638    632      666    672    655
                   062    049    057      047    043    036      028    030    033

  500     1        753    754    747      753    754    747      753    754    747
                   018    016    020      018    016    020      018    016    020
          2        135    168    225      223    253    351      328    382    459
                   046    055    065      055    069    073      074    086    074
          3        369    393    429      436    454    501      504    542    594
                   094    080    073      087    074    071      067    066    065
          4        468    483    490      509    550    589      596    613    656
                   072    059    063      053    053    053      061    049    042
          5        532    538    575      583    596    627      642    664    669
                   061    055    056      048    052    055      048    043    032

 2000     1        801    803    797      801    803    797      801    803    797
                   014    014    014      014    014    014      014    014    014
          2        062    086    126      117    137    206      181    223    321
                   023    034    037      039    043    073      056    069    076
          3        367    361    379      371    410    442      439    444    520
                   096    101    103      098    087    089      076    085    075
          4        447    453    488      470    476    521      529    542    604
                   076    069    071      074    068    062      065    064    052
          5        506    501    537      506    544    572      589    608    648
                   071    060    056      059    060    055      053    053    048

* From left to right in the table, the first column contains sample size, the
second number of factors, and the third, fourth, and fifth distributions of item
difficulties for low levels of obliqueness. Distributions of item difficulties
are then repeated for intermediate and high obliqueness. The first row opposite
the factor number contains means, the second the standard deviations. Both are
reported to three significant decimal places, but decimal points are omitted to
reduce crowding.